Arrangement and method for controlling power modes of hardware resources

ABSTRACT

A circuit arrangement, method of executing program code and method of generating program code utilize power control instructions (90) capable of dynamically controlling power dissipation of multiple hardware resources (50-60) during execution of a program by a processor (14). Moreover, a processor (14) configured to process such power control instructions (90) is capable of maintaining the power modes of the multiple hardware resources (50-60) to that specified in an earlier-processed power control instruction (90), such that subsequently-processed instructions (90) will be processed while the power modes of the multiple hardware resources (50-60) are set to that specified by the earlier-processed power control instruction (90).

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. provisional application Ser. No. 60/430,884 filed Dec. 4, 2002, which is incorporated herein whole by reference.

The invention is generally related to controlling power dissipation in integrated circuits, e.g., for use in low power and other power sensitive applications.

Power dissipation is often a principal design constraint for many integrated circuits (IC's), or “chips”. Integrated circuits, for example, are increasingly used in a wide variety of portable and other battery-powered applications, such as in mobile telephones and other wireless communication devices, portable computers, handheld appliances and game consoles, etc. Moreover, even in non-portable applications, where battery life may not be a concern, integrated circuits can be susceptible to excessive heat, resulting in either the need for expensive and/or bulky cooling components, or reduced IC reliability. The amount of power dissipated by IC's plays a significant role in both battery life and heat generation in an electronic device.

Furthermore, as IC's become more complex, and incorporate faster clock speeds and greater numbers of transistors, the amount of power dissipated by these IC's increases proportionately. As such, significant development efforts have been directed to reducing IC power dissipation.

Some efforts, for example, have been directed to decreasing the power dissipation of individual transistors in an IC, e.g., through modifying a transistor's layout and/or reducing supply voltage. To some extent, modifications to transistor designs and decreases in supply voltage levels have offset some of the increases in power dissipation that have resulted from the use of more complex and higher performing IC's. Nevertheless, additional reductions have been required for many power sensitive applications.

For example, some IC's such as microprocessors for use in mobile applications utilize voltage and/or frequency scaling to reduce the supply voltage and/or clock frequency, and consequently to reduce overall power dissipation. However, such reductions are typically applied IC-wide, and are accompanied by corresponding reductions in processing performance.

Other designs may incorporate sleep modes that place an IC in a low power state in response to a particular command or instruction, or an event such as an external interrupt. For example, some microprocessors support WAIT or HALT instructions that place an entire microprocessor in a low power sleep mode. However, when in such a mode, all effective processing activity in the microprocessor is typically halted until the microprocessor is reawakened by an interrupt or other triggering event.

In still other designs, IC's may have different circuits that are capable of being selectively disabled when they are not in use to reduce overall power dissipation. Typically, such circuits are selectively enabled or disabled in response to the execution of particular instructions that are encountered in association with the execution of a computer program, particularly for IC's that incorporate some form of processor or processing core.

For example, some low power microprocessor designs permit individual functional units to be selectively disabled by routing specific “power down” instructions to those functional units. The power down instructions are inserted by a compiler during compilation of a computer program, such that the instructions will be processed by individual functional units during execution of the computer program by the microprocessor. One drawback of such an approach, however, is that sending individual instructions to specific functional units occupies the processing resources of such functional units, and thus decreases the availability of such functional units, and of the remainder of the processor pipeline, for handling other, productive computational operations.

Related to the aforementioned power down instructions is the use of control bits, associated with each instruction processed by a microprocessor, that dynamically control the enabled status of each functional unit in the microprocessor. However, in this type of design, constant instruction decoding of each instruction is required, which can offset some of the power dissipation reductions obtained by selectively disabling individual functional units. Moreover, the time required to disable or enable a particular functional unit in response to a particular set of control bits may limit the feasible operating frequency of the microprocessor, thus limiting the overall performance of the microprocessor. Furthermore, the addition of control bits to each instruction increases the size of the code, and thus increases storage requirements, or decreases the number of different instructions that may be supported.

Another drawback to the aforementioned instruction-based control schemes is that they are often limited to controlling functional units in a microprocessor. While functional units, e.g., execution units, arithmetic logic units, floating point units, fixed point units, etc., do provide a significant contribution to the overall power dissipation in a microprocessor, most designs incorporate a significant amount of additional circuitry, e.g., caches, register files, etc., that also contribute to power dissipation, but are not addressed by the aforementioned control schemes.

Still other designs may incorporate multiple instruction sets that support different power operations of a microprocessor. Typically, in such designs, one instruction set may utilize a microprocessor fully, whereas another instruction set may use only a part of the microprocessor, thus reducing power consumption. One drawback to this approach, however, is that only a limited number of modes of processor utilization are supported, i.e., full and restricted. Furthermore, the complexity of the code, and thus of the decoding logic in the processor, increases, which itself can increase power dissipation.

In a related field of development, multiple versions of programs may be utilized to support different power dissipation capabilities. However, the use of multiple versions of a program requires a run-time scheduler that selects a code version for execution depending on the current power requirements. Storing the multiple code versions requires more program memory, and executing the scheduler can further increase power dissipation.

With respect to selectively enabling individual circuits in an IC, various manners of disabling these circuits, and thus minimizing their power consumption, may be used. For example, clock gating is often used to disable clocking to a circuit, which effectively limits switching of the transistors in the circuit, as it is often the switching of states in transistors that accounts for the greatest amount of power dissipation in a circuit.

For functional units in a microprocessor, an alternative manner of disabling a circuit is to shut down the power supply to the circuit. Yet another alternative manner is to deactivate input signals to the circuit.

Also, for other circuits, such as memory arrays utilized in dynamic random access memory (DRAM) devices, clock signals and sense amplifiers may be selectively gated off to disable banks of memory cells in an array, and thus reduce power dissipation in the overall array.

Despite the significant gains that have been made in connection with controlling power dissipation in IC's such as microprocessors and the like, a continuing need exists for further improvements in the field.

For example, in the area of program control of selectively disabled circuits in IC's, a need still exists for a manner of selectively disabling circuits having greater granularity and flexibility, and lesser overhead, than is supported by conventional circuit control schemes.

The invention addresses these and other problems associated with the prior art by providing a circuit arrangement, method of executing program code and method of generating program code that utilize power control instructions capable of dynamically controlling power dissipation of multiple hardware resources during execution of a program by a processor. Moreover, a processor configured to process such power control instructions is capable of maintaining the power modes of the multiple hardware resources to that specified in an earlier-processed power control instruction, such that subsequently-processed instructions will be processed while the power modes of the multiple hardware resources are set to that specified by the earlier-processed power control instruction.

These and other advantages and features, which characterize the invention, are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of the invention, and of the advantages and objectives attained through its use, reference should be made to the Drawings, and to the accompanying descriptive matter, in which there are described exemplary embodiments of the invention.

FIG. 1 is a block diagram of a media processor incorporating a dynamic power dissipation control circuit consistent with the invention.

FIG. 2 is a block diagram of the central processing unit referenced in FIG. 1, and including a dynamic power dissipation control circuit consistent with the invention.

FIG. 3 is a block diagram of the register file referenced in FIG. 2, and incorporating enable logic for selectively disabling banks of registers.

FIG. 4 is a block diagram of an exemplary instruction format for a power control instruction suitable for controlling power dissipation in the media processor of FIG. 1.

FIG. 5 is a flowchart illustrating the program flow of a power dissipation optimization routine consistent with the invention.

FIG. 6 is a block diagram illustrating the processing of an exemplary program by the media processor of FIG. 1.

FIG. 7 is a block diagram of an alternate manner of controlling power dissipation in a register file to that shown in FIG. 2.

Dynamic power dissipation control consistent with the invention may incorporate either or both of two concepts that provide substantial advantages over conventional power dissipation control techniques. The first concept applies uniquely to implementing power dissipation control over a register file utilized by a processor or processing core. Consistent with the invention, a register file is partitioned into multiple banks of registers, with each bank of registers including clock, data and address input lines. Enable logic is coupled to such a register file to selectively gate off or disable the clock, data and address input lines for any unused bank of registers in the register file.

The second concept applies more generally to a software-based manner of controlling power dissipation in an integrated circuit that includes a processor or other programmable circuit that processes software instructions. In particular, power control instructions incorporated into program code executed by a processor are used to control the power modes of multiple hardware resources that are selectively configurable into two or more power modes, with each mode for a hardware resource having a particular power consumption state for that hardware resource.

Each power control instruction includes power control information disposed in an operand thereof that is capable of setting the power modes of multiple hardware resources. Furthermore, once the power modes are set by a particular power control instruction, the set power modes are utilized during the processing of subsequent instructions by the processor, e.g., until the power modes are reset by another power control instruction, or a special event (e.g., an external interrupt).
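By way of illustration only, the persistence of power modes across subsequently processed instructions may be modeled in software as sketched below. The resource names, the tuple-based instruction representation, and the policy of restoring default modes upon an interrupt are assumptions made for this sketch and are not taken from the embodiments described herein.

```python
from dataclasses import dataclass, field

# Hypothetical resource names and default modes (assumptions for illustration).
DEFAULT_MODES = {"regfile_bank0": "on", "regfile_bank1": "on", "fpu": "on"}


@dataclass
class Processor:
    # Models the stored power mode state information.
    power_modes: dict = field(default_factory=lambda: dict(DEFAULT_MODES))
    # Records each ordinary instruction with the modes in force when it ran.
    trace: list = field(default_factory=list)

    def step(self, instr):
        kind, payload = instr
        if kind == "pwr":
            # Power control instruction: one operand sets several modes at once.
            self.power_modes.update(payload)
        elif kind == "interrupt":
            # Special event: restoring defaults is an assumed policy.
            self.power_modes = dict(DEFAULT_MODES)
        else:
            # Ordinary instruction: processed under the most recently set modes.
            self.trace.append((payload, dict(self.power_modes)))


def run(program):
    cpu = Processor()
    for instr in program:
        cpu.step(instr)
    return cpu.trace


program = [
    ("op", "add"),
    ("pwr", {"fpu": "off", "regfile_bank1": "off"}),
    ("op", "sub"),
    ("op", "mul"),
    ("interrupt", None),
    ("op", "ld"),
]
trace = run(program)
```

In this sketch, the `sub` and `mul` operations are both recorded with the floating point unit and the second register bank disabled, even though only a single power control instruction precedes them.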

Each of these concepts will be described in greater detail in connection with a description of an exemplary embodiment of a processor integrated circuit that utilizes software-based power dissipation control of a register file. Prior to discussing this specific embodiment, however, an exemplary hardware and software environment is described in greater detail below.

Turning to the Drawings, wherein like numbers denote like parts throughout the several views, FIG. 1 illustrates an exemplary hardware and software environment for a data processing system 10 including a media processor 12 that implements dynamic power dissipation control consistent with the invention. Media processor 12 may be implemented, for example, as a PNX1300 Series TriMedia-compatible media processor available from Philips Semiconductors, or alternatively another media processor architecture such as the Equator MAP 1000, TI TMS320C6xxx, BOPS ManArray, etc. Media processor 12 is a system-on-chip (SOC) integrated circuit device including a central processing unit (CPU) or processing core 14 coupled via an internal bus 16 to a plurality of additional circuit components incorporated into the same integrated circuit device.

CPU 14 may be implemented, for example, as a VLIW processor core, e.g., incorporating a 32-bit address space and a register file comprising 128 32-bit general purpose registers. The processor core includes 27 functional units accessible via five issue slots, as well as 16-KB data and 32-KB instruction caches, with the data cache being dual-ported, and with both caches being 8-way set associative with a 64-byte block size.

Working memory for processor 12 is supplied by an external memory 18 (e.g., SDRAM memory) accessed via a main memory input/output interface block 20, while connectivity to external peripheral components over a peripheral bus 22 (e.g., PCI bus) is facilitated by an external bus interface block 24.

To support various media-processing functions, media processor 12 includes various specialized media processing circuits, including video in/out blocks 26, 28; audio in/out blocks 30, 32; SPDIF output block 34; I2C interface block 36; synchronous serial interface block 38; image coprocessor block 40; DVD Descrambler (DVDD) block 42; Variable Length Decoding (VLD) coprocessor block 44; and timers block 46.

Now turning to FIG. 2, CPU 14 is illustrated in greater detail, and is shown including a bus interface unit (BIU) 50 coupling internal bus 16 to an instruction cache 52 and a data cache 54. Instructions from instruction cache 52 are fed to one or more instruction decoders 56, which output instructions to one or more functional units 58, e.g., various arithmetic logic units, floating point units, subword-parallel multimedia operation units, load/store units, SIMD multimedia operation units, vector multimedia operation units, multipliers, branch units, etc. As noted above, CPU 14 may support VLIW instructions, whereby multiple instruction slots (e.g., five) may be used when processing a VLIW instruction to concurrently route multiple operations encoded into a VLIW instruction to multiple functional units.

CPU 14 is typically implemented as a load/store architecture, whereby functional units 58 access a register file 60, here including 128 32-bit general purpose registers. An additional supported register is a Program Control and Status Word (PCSW) register 61, which is used to set various configuration settings for the CPU, e.g., with respect to floating point operations, byte sex (big/little endian), interrupt enables, exceptions, etc. To provide dynamic power dissipation control consistent with the invention, CPU 14 also includes a power control circuit 62, within which is illustrated a support register 64, herein referred to as a power modes register, which stores power mode state information for various hardware resources in media processor 12 that are capable of being selectively disabled to minimize power dissipation in the media processor. In some implementations it may be desirable to also support the storage of power mode state information in the PCSW or in another support register in the CPU, whereby the power mode state information is combined with other status information that is unrelated to power dissipation control.

Power control circuit 62 controls power dissipation throughout CPU 14, and optionally elsewhere in media processor 12, via the assertion of one or more enable signals 66 to enable logic embedded within various hardware resources in the CPU 14 and/or media processor 12. In this context, a hardware resource may represent practically any electronic circuit in an integrated circuit that it may be practical and/or desirable to disable for the purpose of reducing power dissipation in the integrated circuit. In CPU 14, for example, BIU 50, caches 52, 54, instruction decoders 56, functional units 58 and register file 60 are all illustrated as being hardware resources capable of being selectively disabled, by virtue of the presence of enable logic (designated by the reference indicator “E”) on each respective block.

In the illustrated embodiment, power control circuit 62 selectively asserts enable signals 66 in response to power mode state information stored in power modes register 64. This state information is set in register 64 in response to power control instructions that are embedded within a program being executed by CPU 14. These power control instructions are typically decoded by instruction decoders 56 and used to update power modes register 64, in much the same manner as other conventional register store instructions commonly utilized in the art.

It will be appreciated that the enable signals 66 may be used to completely enable/disable an entire block or hardware resource, or may be used to enable/disable only portions of such blocks/resources and/or to select from among multiple available power consumption states for all or only portions of such blocks/resources. For example, a hardware resource may support more than two power modes, e.g., a sleep or fully off mode, two or more low power or energy saving modes, and a full power mode. While single lines are illustrated for each enable signal 66 in FIG. 2, it will be appreciated that multiple signal paths may be utilized to selectively enable portions of each hardware resource to varying degrees of granularity. Also, it will be appreciated that a hardware resource may utilize practically any energy saving or power dissipation reduction technique known in the art, so long as such technique is capable of being selectively enabled under the control of a power control circuit as described herein.
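As a minimal sketch of how more than two power modes per hardware resource might be represented, the following example packs one multi-bit mode field per resource into a single word such as might be held in a power modes register. The two-bit field width, the mode names and values, and the ordering of the resources are hypothetical choices made for this sketch; no particular encoding is implied by the foregoing description.

```python
from enum import IntEnum


class PowerMode(IntEnum):
    # Hypothetical two-bit encoding: one off mode, two low power modes, full power.
    OFF = 0
    LOW_1 = 1
    LOW_2 = 2
    FULL = 3


# Hypothetical field order, loosely following the blocks of CPU 14.
RESOURCES = ["biu", "icache", "dcache", "decoders", "functional_units", "register_file"]
BITS_PER_RESOURCE = 2


def pack_modes(modes):
    """Pack per-resource modes into one word; unspecified resources stay at FULL."""
    word = 0
    for index, name in enumerate(RESOURCES):
        mode = modes.get(name, PowerMode.FULL)
        word |= int(mode) << (index * BITS_PER_RESOURCE)
    return word


def unpack_modes(word):
    """Recover the per-resource modes from a packed word."""
    mask = (1 << BITS_PER_RESOURCE) - 1
    return {
        name: PowerMode((word >> (index * BITS_PER_RESOURCE)) & mask)
        for index, name in enumerate(RESOURCES)
    }


word = pack_modes({"dcache": PowerMode.LOW_1, "register_file": PowerMode.OFF})
modes = unpack_modes(word)
```

A wider field per resource would allow finer-grained states, at the cost of a larger operand in each power control instruction.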

In the embodiment discussed hereinafter, for example, a hardware resource such as a register file may be controlled so as to reduce power dissipation through the organization of the register file into a plurality of register banks that are individually capable of being selectively disabled in response to power control instructions in a program being executed by a processor. In this regard, each register bank may itself be considered to represent a separate hardware resource.

In the illustrated embodiment, each register bank utilizes enable logic that gates off not only the clock signal, but also the address and data input signals to each disabled register bank. As will become more apparent below, gating off clock, address and data input signals collectively often provides optimal energy savings as the CMOS latches or flip-flops utilized in connection with general purpose registers are often subjected to higher wire capacitances. However, other manners of disabling a register file may be utilized in some embodiments consistent with the invention.

Moreover, while power control circuit 62 is illustrated as being used to control hardware resources disposed solely within CPU 14, it will be appreciated that a power control circuit may be utilized to control other hardware resources consistent with the invention, including for example, resources disposed elsewhere on the same integrated circuit (e.g., all or portions of any of the functional blocks illustrated in FIG. 1), as well as hardware resources disposed on another integrated circuit. A power control circuit consistent with the invention may be used to control power dissipation, in fact, in connection with a wide variety of hardware resources, including for example, register files, memories, caches, issue slots, busses, functional units, functional blocks, IO pads or pins, buffers, instruction reorder logic, embedded field programmable gate arrays (FPGA's), coprocessors, or practically any type of electronic circuit that is capable of being disabled and/or set to a state having a reduced level of power dissipation. Moreover, any of the aforementioned circuits may be considered to incorporate multiple hardware resources that are separately controllable, e.g., such that individual portions of such circuits can be selectively disabled (e.g., individual banks or registers in a register file, individual sets of a cache, individual lines or groups of lines in a bus, etc.).

Furthermore, while power control circuit 62 is illustrated as being disposed in CPU 14, it will be appreciated that a power control circuit may be functionally separate from a CPU or other processing core. In general, the particular manner in which power dissipation control functionality is allocated on an integrated circuit can vary in different embodiments, and as such, the invention is not limited to the particular implementation discussed herein.

As an additional matter, the implementation of power control circuit 62 within a media processor is provided herein by way of example. Power dissipation control, and in particular, the use of power control instructions as described hereinafter, may be utilized in a wide variety of alternate integrated circuits consistent with the invention. Power control instructions, for example, may be utilized in connection with various processor architectures, including VLIW, EPIC, RISC, CISC, DSP, superscalar, etc. Furthermore, the invention is not limited to use within SOC architectures, or other architectures where a processing core is integrated with other supporting circuitry.

In many instances, significant benefits in terms of power dissipation reduction will be realized in connection with VLIW, EPIC, superscalar or other wide-issue architectures where parallel hardware resources may not be fully utilized at all times during program execution, as oftentimes underutilized parallel hardware resources may be selectively disabled consistent with program utilization demands. However, the invention is not limited to applicability solely in connection with wide-issue architectures and the like.

In general, it will be appreciated that any of the hardware-based functionality discussed herein is typically implemented in a circuit arrangement incorporated into one or more integrated circuits, and optionally including additional supporting electronic components. Moreover, as is well known in the art, integrated circuits are typically designed and fabricated using one or more computer data files, referred to herein as hardware definition programs, that define the layout of the circuit arrangements on the devices. The programs are typically generated by a design tool and are subsequently used during manufacturing to create the layout masks that define the circuit arrangements applied to a semiconductor wafer. Typically, the programs are provided in a predefined format using a hardware definition language (HDL) such as VHDL, Verilog, EDIF, etc. While the invention has and hereinafter will be described in the context of circuit arrangements implemented in fully functioning integrated circuits and data processing systems utilizing the same, those skilled in the art will appreciate that circuit arrangements consistent with the invention are also capable of being distributed as program products in a variety of forms, and that the invention applies equally regardless of the particular type of signal bearing media used to actually carry out the distribution. Examples of signal bearing media include but are not limited to recordable type media such as volatile and non-volatile memory devices, floppy and other removable disks, hard disk drives, magnetic tape, optical disks (e.g., CD-ROMs, DVDs, etc.), among others, and transmission type media such as digital and analog communication links. In some embodiments consistent with the invention, other integrated circuit technologies, e.g., FPGA's and the like may also be used to implement some of the hardware-based functionality discussed herein.

As noted above, power control circuit 62 may be controlled via power control instructions embedded within a program executed by CPU 14. These power control instructions may be generated by a programmer, or in the alternative, may be added to a program in an automated fashion by a compiler, linker, optimizer, etc. Moreover, while such automated addition of power control instructions typically occurs prior to runtime, in some embodiments, runtime addition of power control instructions may be utilized, e.g., in connection with just-in-time compilation/optimization or runtime interpretation/instruction decoding.

The automated addition of power control instructions, regardless of the particular manner implemented (i.e., prior to or during runtime, in a compiler or optimizer, etc.), is typically implemented using one or more routines that will be described in greater detail hereinafter. These routines, whether implemented in an operating system or a specific application, component, program, object, module or sequence of instructions, or even a subset thereof, will be referred to herein as “computer program code,” or simply “program code.” Program code typically comprises one or more instructions that are resident at various times in various memory and storage devices in a computer or data processing system, and that, when read and executed by one or more processors in a computer, cause that computer to perform the steps necessary to execute steps or elements embodying the various aspects of the invention. Moreover, while the software-related aspects of the invention have and hereinafter will be described in the context of computers and data processing systems, those skilled in the art will appreciate that the various embodiments of the invention are capable of being distributed as a program product in a variety of forms, and that the invention applies equally regardless of the particular type of signal bearing media used to actually carry out the distribution.

In addition, various program code described hereinafter may be identified based upon the application within which it is implemented in a specific embodiment of the invention. However, it should be appreciated that any particular program nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature. Furthermore, given the typically endless number of manners in which computer programs may be organized into routines, procedures, methods, modules, objects, and the like, as well as the various manners in which program functionality may be allocated among various software layers that are resident within a typical computer (e.g., operating systems, libraries, APIs, applications, applets, etc.), it should be appreciated that the invention is not limited to the specific organization and allocation of program functionality described herein.

Those skilled in the art will recognize that the exemplary environments illustrated in FIGS. 1 and 2 are not intended to limit the present invention. Indeed, those skilled in the art will recognize that other alternative hardware and/or software environments may be used without departing from the scope of the invention.

As noted above, dynamic power dissipation control consistent with the invention may incorporate either or both of two concepts that provide substantial advantages over conventional power dissipation control techniques. The first concept applies uniquely to implementing power dissipation control over a register file utilized by a processor or processing core. The second concept applies more generally to a software-controlled manner of controlling power dissipation in an integrated circuit. To facilitate a better understanding of each of these concepts, an exemplary embodiment incorporating both concepts is described hereinafter in connection with FIGS. 3-6. It will be appreciated, however, that the two concepts discussed hereinafter may be utilized separately and independently in other embodiments, and as such, the invention is not limited to the particular implementation discussed hereinafter.

FIGS. 3-6, in particular, illustrate the software-based control of the power dissipation in a register file utilized in a TriMedia-compatible media processor. It has been found, for example, that in many programmable architectures (e.g., VLIW, EPIC, and superscalar), register files are a large contributor to overall power consumption. In some applications, it has been found that a register file may consume up to 20% of the power consumed by a processing core. In media processors in particular, a register file is often comparatively large, and often includes many ports and registers. Some TriMedia-compatible processors, for example, utilize register files with 128 registers and 20 separate ports. Register file designs are wiring dominated, so relative power dissipation increases with technology scaling. As such, reducing power dissipation in a register file often leads to substantial savings in energy consumption in a media processor.

It has been found that the size of the register file in any programmable architecture is often determined by the applications that demand the highest number of active variables, which are typically stored in the register file. However, while executing other applications with fewer active variables, many of the registers in the register file remain unused. Also, within a particular application, the utilization of a register file can vary substantially at different points in the application. As an example, a TriMedia-compatible media processor has been found to have a comparatively high peak register utilization during AC3 decoding, while during the performance of other operations in a typical application, average register utilization is often relatively low. Thus, it is believed that, depending on the current requirements of a running application, it may be highly desirable to disable unused portions of a register file to reduce its overall power dissipation.

As shown in particular in FIG. 3, one manner of selectively disabling portions of a register file is through partitioning the register file (here register file 60 of FIG. 2) along the address space into several banks 70, and then conditionally enabling or disabling these banks, e.g., in response to enable signals 66 supplied to the banks via power control circuit 62 (FIG. 2).

Each bank 70 may include a plurality of registers such that the register space represented by the register file is partitioned into sets of registers. For example, for a register file that includes 128 registers, it might be desirable to partition the register space into eight banks of 16 registers apiece. While other manners of partitioning registers may be used, it may be desirable, as an example, to utilize the most significant address inputs as bank select signals, and utilize the least significant address inputs to select a specific register from a selected bank. For example, where a register file is partitioned into eight 16-register banks, seven address inputs may be used, with the three high order inputs used as bank select signals, and the four low order inputs used as register select signals.
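The address partitioning described above for a 128-register file organized as eight banks of 16 registers can be sketched as follows. The function names are hypothetical, and the helper that derives a bank enable mask from a set of registers in use is an illustrative assumption rather than a routine described herein.

```python
NUM_BANKS = 8        # eight banks, selected by the three high order address bits
REGS_PER_BANK = 16   # sixteen registers per bank, selected by the four low order bits
REG_SELECT_BITS = 4


def split_address(reg_index):
    """Split a 7-bit register address into (bank select, register select)."""
    if not 0 <= reg_index < NUM_BANKS * REGS_PER_BANK:
        raise ValueError("register index out of range")
    bank = reg_index >> REG_SELECT_BITS          # three high order bits
    register = reg_index & (REGS_PER_BANK - 1)   # four low order bits
    return bank, register


def banks_in_use(live_registers):
    """Return a bit mask with bit b set if any listed register lies in bank b."""
    mask = 0
    for reg_index in live_registers:
        bank, _ = split_address(reg_index)
        mask |= 1 << bank
    return mask
```

For instance, register 37 (binary 0100101) falls in bank 2 at position 5, and a program region that touches only registers 0, 5, 17 and 37 needs only banks 0 through 2, leaving the remaining five banks as candidates for disabling.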

As is well known in the art, register file 60 includes output select logic 72, as well as various inputs such as a clock input 74, address in inputs 76 and data in inputs 78. Furthermore, register file 60 outputs data over data out outputs 80. It will be appreciated that, depending upon the number of input/output ports supported by the register file, as well as the number of registers and the width of each register, different numbers of address in, data in and data out signals may be supplied to the register file. In addition, input select logic (not shown) may also be utilized to support concurrent access of multiple registers by multiple functional units.

To selectively disable a bank of registers, the clock, address in and data in inputs are supplied to each bank through an enable circuit 82 disposed in each bank. Furthermore, the enable signals 66 from the power control circuit are additionally fed to each enable circuit 82 to selectively gate off or guard the clock, address in and data in inputs for the associated register bank 70.

Enable circuit 82 within each bank may be implemented in a number of manners consistent with the invention. For example, one manner of implementing the enable circuit is through the use of a series of gate transistors, with one such transistor coupled to each clock, address in and data in input to a register bank, and gated by the enable signal 66 dedicated to that register bank.

By gating off the address and data inputs to each bank, and not just the clock input, comparatively greater energy savings are typically obtained due to the comparatively high wire capacitances that are associated with CMOS latch or flip-flop (synthesized) register implementations, because gating typically inhibits switching activity on the relatively long wires utilized within the register file banks. It should be appreciated, however, that in some implementations, the additional gating logic may introduce additional delays and may thus inhibit performance to a small extent. Moreover, it will be appreciated that the invention may be utilized in connection with register files implemented using different register implementations.

To implement software-based power dissipation control of register file 60, a power control instruction is supported in the instruction set architecture of CPU 14. One exemplary format for a power control instruction is illustrated at 90 in FIG. 4. As shown, the power control instruction 90 may include an opcode 92 that identifies the instruction as a pwr_control instruction, and an operand 94 that specifies power control information used to set the power modes for the various banks of registers in register file 60. Operand 94 may be implemented, for example, as an immediate operand including a bit mask that includes an enable/disable bit 96 assigned to each bank of registers in the register file. Thus, for example, for eight banks in a register file, an eight-bit immediate operand may be supported in instruction 90.
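As a rough illustration of how such a bit mask operand might map onto per-bank enable signals, consider the sketch below. It assumes that a binary "1" enables a bank and that the least significant bit corresponds to the first bank; the function name `decode_bank_enables` is hypothetical and is not part of the illustrated instruction set.

```python
def decode_bank_enables(operand, num_banks=8):
    """Expand an immediate bit mask into one enable flag per register bank.

    Assumption: bit 0 controls the first bank and a set bit means "enabled".
    """
    if not 0 <= operand < (1 << num_banks):
        raise ValueError("operand does not fit in the bit mask")
    return [bool((operand >> bank) & 1) for bank in range(num_banks)]
```

With an eight-bank register file, an operand of 0x1b (binary 00011011) would, under these assumptions, enable the first, second, fourth and fifth banks and disable the rest.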

At a more general level, the size of operand 94 may be specified by the formula:

$$\sum_{i = 1}^{j} \log_{2}\left( \mathrm{Modes}(i) \right)$$

where j is the number of hardware resources to be controlled (here, register banks) and Modes(i) is the number of power modes of hardware resource i.
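The operand size formula may be evaluated as in the following sketch. The function name `operand_size_bits` is hypothetical, and rounding the total up to a whole number of bits is an assumption added here for resources whose number of power modes is not a power of two; the formula itself does not address that case.

```python
import math


def operand_size_bits(modes_per_resource):
    """Sum log2(Modes(i)) over all controlled resources.

    Assumption: the total is rounded up to a whole number of bits.
    """
    if any(modes < 1 for modes in modes_per_resource):
        raise ValueError("each resource needs at least one power mode")
    return math.ceil(sum(math.log2(modes) for modes in modes_per_resource))
```

For eight register banks with two power modes each (enabled and disabled), this yields eight bits, which agrees with the eight-bit immediate operand of the example above.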

Other instruction formats may be used in the alternative. For example, as shown in FIG. 4, it may be desirable to support an optional source register operand 97 that identifies a register within which power modes state information is stored. Thus, rather than storing the power modes state information directly in the power control instruction and performing an immediate operation, the power modes state information can be maintained in a separate register, with a register operation used to retrieve the desired power modes state information. Other addressing modes may be supported in the alternative.

As is also shown in FIG. 4, it may also be desirable to support a guard operand 98, which may be used to specify a condition that must be met before the power modes state information specified by the power control instruction is applied. Practically any known condition may be used consistent with the invention.

Returning to FIG. 2, in the illustrated embodiment, a power control instruction is processed by the CPU 14 so as to update a power modes register 64 with the power control information specified in the power control instruction. As such, it may be desirable for the power modes register 64 to have an identical mapping to the immediate operand for the power control instruction, such that the power control instruction is processed simply as an immediate write to the power modes register.

Moreover, as has been noted above, in some instances it may be desirable to utilize a pre-existing register such as the PCSW register to store power modes state information. In such instances, a power control instruction would not require a separate opcode. Instead, a pre-existing opcode used to write to the appropriate register may be used, with the operand updating those bits that are utilized in connection with storing power modes state information.

Power control instructions may be incorporated into executable program code in a number of manners consistent with the invention. For example, power control instructions may be added to source code by a programmer during development. In the alternative, a compiler, optimizer, linker, etc., may perform simulation or static analysis of a program under development to determine appropriate locations to insert power control instructions based upon predicted resource utilization.

Furthermore, profiling, static analysis or simulation of a program may be used to determine what hardware resources should be utilized, and what resources should be disabled, during certain sections of the program. For example, if it is determined that only 10 registers are required in a certain section of a program, but that which 10 registers are used is immaterial to program semantics, it may be desirable for a compiler to use registers from only a limited number of register banks, and then insert appropriate power control instructions into the program code to disable unused register banks. Furthermore, if those 10 registers are initially dispersed throughout several register banks, it may be desirable to remap registers to concentrate the registers within a reduced number of register banks.

FIG. 5, for example, illustrates a power dissipation optimization routine 100 that may be executed during compilation or optimization of a computer program to optimize a section of a computer program for optimal power dissipation. For each specified section of a program, routine 100 first analyzes that section to determine the hardware resource usage by that section of program code in block 102. Next, block 104 is optionally executed to attempt to remap resources to concentrate resource usage into a limited set of hardware resources (e.g., to confine registers to a limited number of register banks). Next, block 106 generates and inserts appropriate power control instructions into the program code to disable any unused resources. Processing of the section is then complete.
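The three blocks of routine 100 may be sketched, in greatly simplified form, as follows. The sketch assumes that the analysis of block 102 has already produced the list of registers used by a section, that any register may be freely renamed, and that the register file has eight banks of 16 registers; the function name `optimize_section` and its return format are hypothetical and do not represent an actual compiler pass.

```python
NUM_BANKS = 8
REGS_PER_BANK = 16


def optimize_section(used_registers, remap=True):
    """Return (register_map, bank_mask) for one section of program code.

    used_registers stands in for the result of block 102 (usage analysis).
    """
    used = sorted(set(used_registers))
    if any(not 0 <= reg < NUM_BANKS * REGS_PER_BANK for reg in used):
        raise ValueError("register number out of range")

    # Block 104 (optional): pack the used registers into the lowest banks.
    if remap:
        register_map = {old: new for new, old in enumerate(used)}
    else:
        register_map = {reg: reg for reg in used}

    # Block 106: build the operand of the power control instruction,
    # setting one bit for every bank that still holds a used register.
    bank_mask = 0
    for new_reg in register_map.values():
        bank_mask |= 1 << (new_reg // REGS_PER_BANK)
    return register_map, bank_mask
```

In this model, ten registers scattered across all eight banks require a mask of 0xff without remapping, whereas with remapping they fit in a single bank and the mask becomes 0x01.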

Routine 100, or a routine similar in functionality thereto, may also be utilized during runtime, e.g., in connection with interpretation or just-in-time compilation. Furthermore, it will be appreciated that routine 100 may be used in connection with the generation of instructions that are scheduled for parallel and/or out-of-order operation during runtime, e.g., in a superscalar processor architecture. In such implementations, a compiler and the runtime hardware desirably should limit reordering of power control instructions to minimize influence on other speculated instructions, e.g., by assigning a side effect to the power control instruction, which limits its run-time speculation. In still other embodiments, an operating system can schedule/issue power control instructions if a CPU/processor is not fully loaded with computations.

In the alternative, routine 100 may be used in connection with explicitly parallel instruction set architectures such as VLIW or EPIC code, where detection of parallel instructions occurs during compilation. In such implementations, the insertion of a power control instruction may be considered to include the insertion of a power control operation into a larger VLIW or EPIC instruction comprising multiple operations.

The program code within which power control instructions have been embedded during compilation or during runtime desirably includes power control instructions interspersed within the program code only at times for which a change in hardware resource utilization is desirable. Moreover, it is often desirable for a single power control instruction to be capable of controlling the enabled/disabled status of multiple hardware resources. As such, minimal additional processing overhead is typically associated with power control instructions consistent with the invention, thus minimizing any adverse performance effects due to the insertion of additional instructions into the program code.

FIG. 6, for example, diagrammatically illustrates the execution of an exemplary portion of a program compiled in the manner described above, e.g., as might be utilized in connection with a TriMedia-compatible media processor. In this example, it is assumed that there are five issue slots, designated at 110, 112, 114, 116 and 118, with the instruction executed in each issue slot during each cycle (cycles 0-4) illustrated in its respective issue slot for that cycle. Assume also that a register file includes 128 registers (denoted as r0-r127) partitioned into eight banks, and that a power control instruction utilizes an immediate operand where a selected bank is enabled when a binary “1” is encountered in the operand bit mask position associated with that bank. In this example, the latency of the pwr_control instruction is one cycle; however, it will be appreciated that a pwr_control instruction can have a latency of more than one cycle in some implementations.

Assume also that during cycle 0, one of the instructions being processed by the CPU (here in issue slot 112) is a power control instruction with an immediate operand of 0x1b (binary 00011011), which enables only banks 1, 2, 4 and 5 of the register file (i.e., registers r0-r31 and r48-r79). As a result of the execution of this instruction, power modes register 64 (FIG. 2) is updated to store the 0x1b (binary 00011011) value. As a result, during subsequent cycles, register banks 3, 6, 7 and 8 are disabled. Note, however, that, during execution of the power control instruction in cycle 0, all register banks are available for access by other instructions (assuming that all banks were previously enabled).

During cycles 1 and 2, no further power control instructions are encountered. As a result, register banks 3, 6, 7 and 8 remain disabled, and all instructions are limited to accessing registers from banks 1, 2, 4 and 5 (r0-r31 and r48-r79).

During cycle 3, the power modes state information stored in power modes register 64 continues to maintain register banks 3, 6, 7 and 8 in a disabled state. However, one of the instructions executed during this cycle (issued to issue slot 118) is a power control instruction with an immediate operand of 0xff (binary 11111111), which enables all eight banks of the register file for instructions executed in cycle 4.
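The cycle-by-cycle behavior of this example can be modeled with the short simulation below. It is only a behavioral sketch under the stated assumptions (eight banks of 16 registers, all banks initially enabled, one-cycle pwr_control latency); the class `PowerModesRegister`, its methods and the function `run_fig6_example` are hypothetical names and are not taken from the figures.

```python
class PowerModesRegister:
    """Behavioral model of a power modes register with one-cycle latency."""

    def __init__(self, num_banks=8, regs_per_bank=16):
        self.regs_per_bank = regs_per_bank
        self.value = (1 << num_banks) - 1  # assume all banks start enabled
        self._pending = None

    def issue_pwr_control(self, operand):
        # The new mask takes effect only from the next cycle onward.
        self._pending = operand

    def end_cycle(self):
        if self._pending is not None:
            self.value = self._pending
            self._pending = None

    def is_accessible(self, reg):
        # A register is accessible when the bit for its bank is set.
        return bool((self.value >> (reg // self.regs_per_bank)) & 1)


def run_fig6_example():
    """Return the mask in effect during each of cycles 0-4."""
    pmr = PowerModesRegister()
    schedule = {0: 0x1B, 3: 0xFF}  # cycle -> pwr_control immediate operand
    in_effect = []
    for cycle in range(5):
        in_effect.append(pmr.value)
        if cycle in schedule:
            pmr.issue_pwr_control(schedule[cycle])
        pmr.end_cycle()
    return in_effect
```

In this model the mask in effect is 0xff during cycle 0, 0x1b during cycles 1 through 3, and 0xff again during cycle 4, which mirrors the sequence described above.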

The aforementioned dynamic power dissipation control techniques provide a number of advantages over conventional designs. Significant flexibility is provided in terms of controlling a wide variety of hardware resources with minimal processing overhead, as compared to conventional designs that require instructions to be routed to specific functional units, or that require enable/disable commands to be appended to every instruction and constantly decoded.

Furthermore, the aforementioned techniques provide the flexibility to address various power dissipation-related issues to appropriately balance performance and power dissipation in a number of useful ways. For example, using the aforementioned techniques, processor performance can be scaled up by, for example, adding registers or functional units, but without increasing power dissipation on the code where the extra resources are not utilized. Moreover, for applications where performance is not as crucial, power control instructions can serve to sacrifice performance for lower power consumption, e.g., by scheduling operations to limited resources while disabling other resources.

With respect to software-based power dissipation control, it will be appreciated that alternate instruction formats, alternate compilation, optimization and/or scheduling routines, and alternate processor architectures may be utilized in other embodiments consistent with the invention. Moreover, resource disabling circuitry other than that described above in connection with the herein-described banked register file design may be used in connection with software-based power dissipation control consistent with the invention.

Furthermore, with respect to the herein-described banked register file design, it will be appreciated that control mechanisms other than the herein-described software-based control mechanism may be utilized consistent with the invention. For example, a hardware-based control mechanism may be used to dynamically enable certain hardware resources based upon dynamic decoding of read/write addresses. As an example, FIG. 7 illustrates an alternate register file design 120 that includes a plurality of register banks 122, each including gating logic 124 for use in selectively gating off clock inputs 126, data in inputs 128 and address in inputs 130 supplied to the register file, similar to register file 60 of FIG. 3. However, rather than being provided with enable signals from a software-based control circuit, register file 120 includes a hardware-based enable logic circuit 132 that includes an address decoder 134 for dynamically generating individual bank enable signals 136 for selectively disabling various unused register banks.

Address decoder 134 may, for example, selectively disable register banks during any cycle for which no registers within such banks are currently being accessed. Particularly for register files that support multiple input ports, a determination of what registers are being accessed during a given cycle is readily made, and thus provides the ability to selectively reduce power dissipation in a register file without any changes in the compiler and/or the instruction set architecture utilized by a particular processor design.
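The per-cycle decision made by such an address decoder may be modeled as in the sketch below. It assumes eight banks of 16 registers and that the set of register addresses presented on all ports during a cycle is known; the function name `bank_enables_for_cycle` is hypothetical and the code is a behavioral model only, not a description of the decoder circuitry.

```python
def bank_enables_for_cycle(accessed_addresses, num_banks=8, regs_per_bank=16):
    """Return one enable flag per bank for a single cycle.

    A bank is enabled only if at least one register address presented
    on any port during the cycle falls within that bank.
    """
    enables = [False] * num_banks
    for addr in accessed_addresses:
        if not 0 <= addr < num_banks * regs_per_bank:
            raise ValueError("register address out of range")
        enables[addr // regs_per_bank] = True
    return enables
```

For instance, a cycle that accesses only r3, r20 and r21 would leave the first two banks enabled and gate off the remaining six in this model.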

Various additional modifications will be apparent to one of ordinary skill in the art having the benefit of the instant disclosure. Therefore, the invention lies in the claims hereinafter appended.

1. A circuit arrangement comprising: a plurality of hardware resources, wherein each hardware resource has a power mode configurable between at least first and second power consumption states; a processor coupled to the plurality of hardware resources, the processor configured to process program code that includes at least one power control instruction that includes an operand having power control information disposed therein, wherein the processor is configured to process the power control instruction by selectively setting power modes of at least two hardware resources among the plurality of hardware resources based upon the power control information disposed in the power control instruction, and maintain the power modes of the at least two hardware resources to that specified in the power control instruction while processing at least one subsequent instruction in the program code; a support register that stores power modes state information for the plurality of hardware resources; and enabling logic coupled to the support register and configured to control the power modes of the plurality of hardware resources responsive to the power modes state information stored in the support register, wherein the processor is configured to selectively set the power modes of the at least two hardware resources by storing the power control information from the power control instruction in the support register.
2. The circuit arrangement of claim 1, wherein the support register comprises a power modes register.
3. The circuit arrangement of claim 1, wherein the support register includes additional status information that is unrelated to power dissipation control.
4. The circuit arrangement of claim 1, wherein a subset of the plurality of hardware resources comprises a plurality of banks of registers defining a register file, wherein the enable logic includes a plurality of enable circuits, each associated with a bank of registers from the plurality of banks of registers, and each configured to selectively disable its associated bank of registers responsive to an enable signal, wherein the enable logic is further configured to generate the enable signal for each bank of registers from the power modes state information stored in the support register.
5. The circuit arrangement of claim 4, wherein each bank of registers includes at least one clock input, address input and data input, and wherein the enable circuit for each bank of registers is configured to selectively gate off the clock, address and data inputs for its associated bank of registers in response to the enable signal provided thereto.
6. A circuit arrangement, comprising: a plurality of hardware resources, wherein each hardware resource has a power mode configurable between at least first and second power consumption states; a processor coupled to the plurality of hardware resources, the processor configured to process program code that includes at least one power control instruction that includes an operand having power control information disposed therein, wherein the processor is configured to process the power control instruction by selectively setting power modes of at least two hardware resources among the plurality of hardware resources based upon the power control information disposed in the power control instruction, and maintain the power modes of the at least two hardware resources to that specified in the power control instruction while processing at least one subsequent instruction in the program code; wherein the power control information in the operand identifies a register within which power modes state information for the at least two hardware resources is stored, and wherein the processor is configured to selectively set the power modes of the at least two hardware resources by retrieving the power modes state information from the register identified by the power control information in the operand.
7. A method of executing program code on a processor coupled to a plurality of hardware resources, each having a power mode configurable between at least first and second power consumption states, the method comprising: processing a power control instruction from the program code by selectively setting power modes of at least two hardware resources among the plurality of hardware resources based upon power control information disposed in an operand of the power control instruction; and processing at least one subsequent instruction in the program code while the power modes of the at least two hardware resources are set to that specified by the power control information of the power control instruction, wherein the processor includes a support register that is utilized by enable logic in the processor to set the power modes of the plurality of hardware resources, and wherein selectively setting the power modes of at least two hardware resources includes storing the power control information in the support register.
8. The method of claim 7, further comprising, after processing the first subsequent instruction, processing a second power control instruction from the program code by storing second power control information disposed in an operand thereof in the support register such that the power mode of a first hardware resource is modified, and processing a second subsequent instruction after processing the second power control instruction, whereby the second subsequent instruction is processed while the power mode of the first hardware resource is set to that specified in the second power control instruction.
9. The method of claim 7, wherein a subset of the plurality of hardware resources comprises a plurality of banks of registers defining a register file, wherein the enable logic includes a plurality of enable circuits, each associated with a bank of registers from the plurality of banks of registers, and each configured to selectively disable its associated bank of registers responsive to an enable signal, the method further comprising generating the enable signal for each bank of registers from the power modes state information stored in the support register.
10. The method of claim 9, wherein each bank of registers includes at least one clock input, address input and data input, and wherein the enable circuit for each bank of registers is configured to selectively gate off the clock, address and data inputs for its associated bank of registers in response to the enable signal provided thereto.
11. A method of generating program code for execution by a processor coupled to a plurality of hardware resources, each having a power mode configurable between at least first and second power consumption states, the method comprising: (a) analyzing at least a portion of a program to determine utilization of the plurality of hardware resources by the processor during execution of at least a section of program code from the program; (b) based upon the determined utilization of the plurality of hardware resources, inserting a power control instruction into the program, the power control instruction including power control information disposed in an operand thereof that specifies power modes for at least two hardware resources among the plurality of hardware resources, wherein the power control instruction is configured to be executed prior to at least one non-power control instruction in the program code, and wherein the program code is configured to cause the processor to dynamically set the power modes for the at least two hardware resources to that specified in the power control instruction such that the non-power control instruction will be processed while the power modes of the at least two hardware resources are maintained at that specified in the power control instruction.
12. The method of claim 11, wherein analyzing the program and inserting the power control instruction are performed during at least one of compilation and optimization of the program.
13. The method of claim 11, wherein analyzing the program and inserting the power control instruction are performed concurrently with execution of the program by the processor.
14. The method of claim 11, further comprising consolidating resource usage in the program to a limited subset of hardware resources prior to inserting the power control instruction into the program.